Use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/62494

Title: Assessing binarization algorithms for document images
Author(s): BERNARDINO, Rodrigo Barros
Keywords: Binarization algorithms; Historical documents; Scanned documents; Photographed documents; Smartphones; Performance evaluation
Issue date: 9-Sep-2024
Publisher: Universidade Federal de Pernambuco
Citation: BERNARDINO, Rodrigo Barros. Assessing binarization algorithms for document images. 2024. Thesis (Doctorate in Computer Science) – Universidade Federal de Pernambuco, Recife, 2024.
Abstract: Binarization algorithms are essential for document processing, analysis, compression, and recognition, and their performance is heavily influenced by document characteristics such as paper texture and noise. This thesis introduces novel algorithms and evaluation methodologies for assessing binarization performance, focusing on image quality, processing time, and file size. Nearly 70 binarization schemes were tested on 39 historical documents and 376 mobile-captured images. To expand the analysis, the Direct Binarization approach was proposed, which analyses the RGB channels of input images separately. This generated hundreds of additional images, which were used to train an automatic binarization algorithm selection tool, the Image Matcher, based solely on paper texture and the strength of the back-to-front interference. The tool demonstrated significant improvements in binarization results across various cases. Recognizing the growing prevalence of smartphone-captured documents, the thesis also investigated this type of document by proposing and extensively testing three new evaluation measures: the proportion of black pixels in the binary image, a normalized Levenshtein distance, and a combined metric incorporating both. These measures enabled a comprehensive assessment of mobile-captured images taken with six widely used mobile devices under varying conditions, including strobe flash settings, illumination, and positional changes. Additionally, the compressed image size (using the TIFF Group 4 compression scheme) proved to be a valuable metric for evaluating the algorithms' efficiency. The results show that if processing time is the priority, the Michalak21a algorithm applied to the red channel is preferred for this type of image, whereas if compression rate is the priority, Yinyang22 is the better choice. Selecting the algorithm for a given setup with the PL measure gave better choices than relying on OCR accuracy alone.
The thesis also significantly expanded existing datasets for document image binarization by adding 24 new historical document images with manually generated ground truth and 296 mobile-captured images.
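The first two evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names are hypothetical, the image is assumed to be a nested list where 0 means black and 1 means white, and the Levenshtein distance is assumed to be normalized by the length of the longer string. The abstract does not state how the two measures are combined, so no combined metric is shown.

```python
from typing import Sequence


def black_pixel_proportion(binary_image: Sequence[Sequence[int]]) -> float:
    """Fraction of black pixels, assuming 0 = black and 1 = white."""
    total = sum(len(row) for row in binary_image)
    if total == 0:
        return 0.0
    black = sum(1 for row in binary_image for pixel in row if pixel == 0)
    return black / total


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_levenshtein(ocr_text: str, reference_text: str) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(ocr_text), len(reference_text))
    if longest == 0:
        return 0.0
    return levenshtein(ocr_text, reference_text) / longest
```

Under these assumptions, a lower normalized distance between the OCR output of a binarized image and the reference transcription indicates a closer match, while the black-pixel proportion summarizes how much ink the binarization kept.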
URI: https://repositorio.ufpe.br/handle/123456789/62494
Appears in collections: Doctoral Theses - Computer Science

Files associated with this item:
File: TESE Rodrigo Barros Bernardino.pdf
Size: 62.73 MB
Format: Adobe PDF


This file is protected by copyright



This item is licensed under a Creative Commons License